MEMORANDUM 

RM-5387-PR

NOVEMBER 1967


DIGITAL COMPUTER SIMULATION:

STATISTICAL CONSIDERATIONS

George  S.  Fishman  and  Philip  J.  Kiviat 


PREPARED  FOR: 

UNITED  STATES  AIR  FORCE  PROJECT  RAND 




SANTA  MONICA  •  CALIFORNIA 




This research is supported by the United States Air Force under Project
RAND, Contract No. F44620-67-C-0045, monitored by the Directorate of
Operational Requirements and Development Plans, Deputy Chief of Staff,
Research and Development, Hq USAF. Views or conclusions contained in this
Memorandum should not be interpreted as representing the official opinion
or policy of the United States Air Force.

DISTRIBUTION  STATEMENT 

Distribution  of  this  document  is  unlimited. 




PREFACE

This is one of a series of RAND Memoranda on digital computer
simulation. Preceding work on this subject has been described in
G. S. Fishman, Digital Computer Simulation: The Allocation of Computer
Time in Comparing Experiments, The RAND Corporation, RM-5288-1-PR,
October 1967, and P. J. Kiviat, Digital Computer Simulation: Modeling
Concepts, The RAND Corporation, RM-5378-PR, September 1967. The purpose
of this Memorandum is to describe a number of statistical problems that
materialize during computer simulation experiments. The Memorandum
gives references (when they exist) that will assist an experimenter in
resolving these problems.
SUMMARY

This Memorandum describes a number of statistical problems that
arise in computer simulation experiments. Failure to resolve these
problems adequately can significantly degrade the value of experimental
results. References are given that should assist an experimenter in
handling them.

The Memorandum describes three principal problem areas: verification,
validation, and problem analysis. Verification insures that a
simulation model containing a mathematical structure and a data base
behaves as an experimenter intends. The complexity of models often
makes it difficult to determine whether their basic operating assumptions
are satisfied.

Validation tests the agreement between the behavior of a simulation
model and the observed behavior of a real system. This requires
empirical data. If a behavioral equivalence can be established between a
simulation model and a real system, we may regard the behavior of the
model and the system as being consistent. Since a simulation model is
often exercised with modifications that do not currently exist in a real
system, it is important that a benchmark of consistency be established
whenever possible to provide confidence for extrapolations.

Problem analysis embraces a host of statistical problems relating
to the collection, reduction, and presentation of data generated by
computer simulation. The choice of sampling interval, the use of
variance reduction techniques, and the estimation of reliability are
problems common to all simulation experiments containing random phenomena.
These and similar problems are considered and references are given to
discussions and solutions.
I. INTRODUCTION


Many system simulation experiments are driven by input processes
containing elements of random behavior. In such simulations, statistical
reliability must be considered if experimental results are to be
interpreted properly. Statistical considerations also enter into the
evaluation of simulation model designs. This Memorandum describes
these considerations, identifying how and where they become important
during the planning, performance, and analysis of simulation experiments.
The description can be viewed as tracing the elements of a typical
experiment from inception through analysis, defining statistical problems
and relating them to the formal body of statistical theory.

The problems described are inherent in all stochastic system
simulation models. An experimental design's ability to reveal useful
insights into a system depends to a great extent on how well these
problems are solved. Failure to deal with them may cause errors in
interpreting observed associations between system input and output. One
common error is the underestimation of the reliability of system response
measurements, caused by failure to account for autocorrelation in system
response time series generated by a simulation model. Another frequent
source of error is the assumption that random numbers generated within
a simulation model are independent, when in fact the method of random
number generation employed induces unwanted correlation.

Our aim is to promote awareness of problems, not to solve them.
The study offers no general solutions, but provides references germane
to the statistical problems described. Some references describe
particular solutions; others offer methods of analysis.

To understand the role of statistics in system simulation
experiments, a knowledge of how these experiments developed is helpful.
System simulation may be regarded as an extension of Monte Carlo methods.
These methods, which concern experiments with random numbers, began their
systematic development during World War II when they were applied to
problems related to the atomic bomb. The work involved direct
simulation of probabilistic problems concerned with random neutron
diffusion in fissionable material [11]. Shortly thereafter, it was
proposed that Monte Carlo methods be applied to solve certain integral
equations, occurring in physics, that were not amenable to analytical
solution. Stochastic processes often existed whose parameters satisfied
these equations. One could estimate these parameters (and hence the
solution to the equations) by performing Monte Carlo experiments on the
stochastic processes.

The reliability of parameter estimates was the dominant statistical
problem in these Monte Carlo experiments. Since the estimates were
generally the sum of independent, identically distributed random
variables, their reliability, measured by the standard error, was
inversely proportional to $n^{1/2}$, $n$ being the sample size; a tenfold
improvement in reliability required a 100-fold increase in sample size.
For many problems, random sampling was prohibitively expensive even with
digital computers. The crucial statistical problem was finding ways of
reducing the variance of an estimator for a given sample size. A number
of these variance reduction methods are described in [23]. A particularly
useful variance reduction technique known as the method of antithetic
variates is described in Hammersley and Handscomb [11].
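
The idea of antithetic variates can be sketched as follows. This is a
minimal illustration, not the procedure given in [11] or [23]: it
estimates the integral of exp(u) over the unit interval, first from
pairs of independent uniform numbers and then from antithetic pairs
(u, 1 - u). The integrand, the function names, the seed, and the number
of pairs are all assumptions chosen for the example.

```python
import math
import random
import statistics

def crude_estimates(f, n_pairs, rng):
    # Each value averages f over two independent uniform draws.
    return [(f(rng.random()) + f(rng.random())) / 2.0 for _ in range(n_pairs)]

def antithetic_estimates(f, n_pairs, rng):
    # Each value averages f over a draw u and its antithetic partner 1 - u.
    values = []
    for _ in range(n_pairs):
        u = rng.random()
        values.append((f(u) + f(1.0 - u)) / 2.0)
    return values

rng = random.Random(12345)          # illustrative seed
crude = crude_estimates(math.exp, 5000, rng)
anti = antithetic_estimates(math.exp, 5000, rng)

crude_mean = statistics.fmean(crude)
anti_mean = statistics.fmean(anti)
crude_var = statistics.variance(crude)   # variance of one pair average
anti_var = statistics.variance(anti)

print(crude_mean, anti_mean, crude_var, anti_var)
```

Both schemes use the same number of function evaluations. Because
exp(u) is monotone in u, the two members of an antithetic pair are
negatively correlated, so the pair averages should scatter less than
averages of independent draws.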

The concept of system simulation became a reality in the early
1950's, when there was a shift in emphasis from looking at parts of a
problem to examining the simultaneous interactions of all parts. This
shift was at least partially due to the fact that system simulation
experiments had become feasible on digital computers, which were
undergoing order-of-magnitude advances in speed. Simulation made it
possible to carry out fully integrated system analyses which were
generally far too complex to be carried out analytically. This was
especially true for studies of the interactions among parts of a system.

In  the  past  decade,  the  ability  to  model  complex  systems  has  greatly 
improved.  Specialized  computer  simulation  languages  such  as  GPSS, 
SIMSCRIPT  and  SIMULA  offer  convenient  formats  for  describing  system 
problems.  Along  with  the  improvements,  however,  have  come  a  number  of 
statistical  problems,  few  of  which  have  been  satisfactorily  solved.  In 
fact,  some  of  them  have  not  even  been  recognized  yet  as  serious  problems. 


Verification, validation, and problem analysis are tasks demanding
careful statistical consideration. Verification determines whether a
model with a particular mathematical structure and data base actually
behaves as an experimenter assumes it does. Validation tests whether
a simulation model reasonably approximates a real system. Problem
analysis seeks to insure the proper execution of the simulation and
proper handling of its results; consequently it deals with a host of
matters: the concise display of solutions, efficient allocation of
computer time, proper design of tests of comparison, and correct
estimates of sample sizes needed for specified levels of accuracy.

In other words, verification and validation insure that a
simulation model is properly designed; only after a model has been
verified and validated can an experimenter justifiably use it to probe
system behavior. Problem analysis mainly deals with the results of
experimental probing.

Of the remaining sections of the Memorandum, Sec. II provides
some necessary definitions and motivation, Secs. III and IV discuss
problems associated with the design and proof-testing of a simulation
model, and Sec. V considers problems associated with the use of
simulation models. The format of the last three sections is: presentation
of problems, brief discussion of advised solutions, references to
relevant literature.
II. SIMULATION MODELS

The concepts discussed from here on can best be understood in the
context of a typical simulation model. This section defines a number
of terms used in succeeding sections, examines a typical model to show
these terms in their proper context, and indicates some problem areas
connected with model structure and data systems that should concern
every model-builder.

Every simulation model comprises two systems: a data system
and a logical system. Both present a model-builder with problems;
both contribute equally to the validity of a final simulation model.

When we first look at a simulation model we see its logical
structure--the way in which a system's operations have been analyzed
and factored into discrete units, and these units combined so that the
model can be made to reproduce the system's behavior. When we look
at a model more deeply, we see that it contains sequences of data
comparisons and logical tests. These tests cause a model to take
different actions depending on numerical values that are either input
from the world outside its boundaries or computed within. The model's
behavior is conditioned by these data values, and its results are
sensitive to data representations and methods of data generation.

Consider the simple one-machine shop with a waiting line, shown
in Fig. 1. Items arrive at the machine for processing; the arrow
coming from the left shows the jobs arriving with average arrival rate
$\lambda$. If the machine is free when a job arrives it immediately
begins service, which is performed at an average service rate $\mu$. A
job that arrives when the machine is engaged waits in line until it can
be processed. The waiting line is pictured as a box; in a real
system it might be a tote box or a pile of partially completed parts.
When a job is completed it leaves the service facility (arrow going
to the right), freeing the machine for another job. If jobs are
waiting in the line (queue), one is selected for service according to
a queue discipline and the machine is engaged again. If no jobs are
waiting, the machine remains idle until the next job arrival.


Fig.  1  --  Simple  machine  shop  model 

Systems such as this, in which jobs arrive, possibly wait in
queues, and are serviced, are called queueing systems. Almost all
simulation models have queueing systems imbedded in them.

Simulating a system like the one described requires the definition
of events that take place during its operation. Events occur at points
in time when system activities begin and end; if an activity has no
duration, e.g., a decision made at an instant in time, it only has one
related event.

A gross representation of the logical structure of a queueing
system is shown in Fig. 2. The activities pictured are jobs arriving
and jobs being serviced. Jobs arrive at the shop at random times. Let
the first $N$ jobs that arrive be denoted $j_1, j_2, \ldots, j_N$ and
their arrival times be denoted $t_1, t_2, \ldots, t_N$. Then the times
between job arrivals are:

$$a_1 = (t_1 - t_0), \quad a_2 = (t_2 - t_1), \quad \ldots, \quad a_N = (t_N - t_{N-1}).$$

Inputs to the queueing system are simulated by generating job arrivals
at the service facility; interarrival times rather than arrival times
are usually used. When a job arrives, the time when the next job will
arrive is computed by random sampling from an interarrival time
distribution. Two data problems associated with this simulation are
determining the correct statistical sampling distribution and generating
random samples from it. Section IV discusses some problems concerned
with selecting a sampling distribution. Methods for generating random
samples from various statistical distributions can be found in [4]
and [24].
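
As one hedged illustration of sampling from an interarrival time
distribution, the sketch below uses the inverse transform method for an
exponential distribution. The exponential form, the function names, and
the parameter `rate` (playing the role of the average arrival rate) are
assumptions made for the example; the methods of [4] and [24] may differ.

```python
import math
import random

def exponential_sample(rate, rng):
    # Inverse transform: if u is uniform on [0, 1), then
    # -ln(1 - u) / rate is exponentially distributed with mean 1 / rate.
    u = rng.random()
    return -math.log(1.0 - u) / rate

def arrival_times(rate, n_jobs, rng):
    # Accumulate interarrival times into arrival times t_1, ..., t_N.
    t = 0.0
    times = []
    for _ in range(n_jobs):
        t += exponential_sample(rate, rng)
        times.append(t)
    return times

rng = random.Random(7)                     # illustrative seed
times = arrival_times(2.0, 20000, rng)     # rate of 2 jobs per time unit
print(times[-1] / len(times))              # average interarrival time
```

The printed average should lie near 1 / rate; how near depends on the
sample size, which is exactly the reliability question raised earlier.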


Fig. 2 -- Basic queueing model (flowchart of the arrival activity and
the service activity; a job arrives and enters the queue if the machine
is busy)

A sequence of job arrival times constitutes a sample from a
simulation input process. Each arrival generates an interarrival
time for the next job and a service time for itself. Figure 3
illustrates the arrival event in some detail, showing the sequence of
simulation activities: the generation of an interarrival time and a
service time, and placement of a new arrival in process or in queue.*
When a job arrives it is placed in service if the server is free;
otherwise, it is placed in queue. Call the service times for the $N$
jobs that enter the shop $s_1, s_2, \ldots, s_N$. The sequence of service
times also constitutes a simulation input process, as random samples
are drawn from some service time distribution whenever a job is
processed. For each job that passes through the shop, two (random)
quantities must be determined -- $a_i$ and $s_i$. The quantity determines

*The notation used in Fig. 3 is taken from P. J. Kiviat, Digital
Computer Simulation: Modeling Concepts, RM-5378-PR, September 1967.
All simulation models are driven by some basic force, generally
the arrival of a task, job, or request of some sort in the simulated
system. Each job's progress through the system is determined by two
sets of factors: its characteristics, and pressures exerted by the
system. Job characteristics can be few or many; in our simple model
there are two: an arrival time and a service time. These
characteristics can be generated at one time or at different stages in a
job's life as it passes through a simulated system. Regardless of where
they are generated, they belong to a job and contribute to its
simulated behavior.

A job with n characteristics can be described by a list of these
characteristics which we call an n-tuple. A job in our queueing model
is characterized by a 2-tuple $(a_i, s_i)$. A typical problem encountered
when constructing a simulation model is the generation of job
characteristics; an important problem encountered while checking out a
simulation model is the examination of a sequence of generated job
characterizations, as we call these n-tuples.

As Fig. 3 shows, a job does not necessarily have to pass directly
through the shop; it can wait in line while other jobs are being
processed. If $T_i$ denotes the time that job $i$ leaves the shop, then
$w_i = T_i - t_i - s_i$ is the time job $i$ spends waiting for service.
The sequences $T_1, T_2, \ldots, T_N$ and $w_1, w_2, \ldots, w_N$ are
simulation output processes, sequences of variables whose values are
determined by the activities that take place within the simulation model.
If the model is designed so that certain jobs have priority over others,
then low-priority jobs will have long waiting times; if it is designed
with a service facility that shuts down periodically for repairs and rest
periods, then the sequence of jobs that exit from the shop will reflect
this.
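
The relation between the input sequences and the output processes can
be sketched for the one-machine shop as follows. This is a minimal
sketch that assumes jobs are served in order of arrival; the function
and variable names are illustrative, not taken from the Memorandum.

```python
def fifo_single_server(arrival_times, service_times):
    # Returns departure times T_i and waiting times w_i = T_i - t_i - s_i
    # for a single machine serving jobs in order of arrival.
    departures = []
    waits = []
    machine_free_at = 0.0
    for t_i, s_i in zip(arrival_times, service_times):
        start = max(t_i, machine_free_at)   # wait if the machine is engaged
        T_i = start + s_i                   # time job i leaves the shop
        departures.append(T_i)
        waits.append(T_i - t_i - s_i)
        machine_free_at = T_i
    return departures, waits

# Four jobs: the second and third arrive while the machine is engaged.
T, w = fifo_single_server([0, 1, 2, 10], [3, 1, 1, 2])
print(T)   # departure times
print(w)   # waiting times
```

In this small case the second and third jobs each wait 2 time units,
while the first and fourth find the machine free and do not wait.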

A simulation model is designed to generate output processes that
can be studied to observe a system's behavior as its data and/or
logical structure are changed. Data influence a model through the
selection of statistical sampling distributions, random sampling
procedures, and activity levels. The rates at which jobs arrive and
are serviced, $\lambda$ and $\mu$ respectively in Fig. 1, are activity
levels that specify the intensity of system operations. Figures 3 and 4
illustrate some influences that model structure exerts on a simulation
study.


Fig.  4  —  End  of  service  event 


The operating rules used to select a job from a waiting line
clearly are part of the model structure and influence system behavior.
A complex model generally contains many different kinds of operating
rules: decision mechanisms, search and choice procedures, and
scheduling heuristics are some that are found most frequently. We
have chosen a queue discipline to illustrate the effect of an operating
rule in a model. A rule under which jobs that have short processing
times are selected first will produce a sequence of $T_i$'s close to one
another followed by sequences with greater values. The character of
the output process will be different under this rule from the output
under a rule that selects jobs in another way.
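
The effect of a queue discipline on the output process can be sketched
by running the same jobs under two rules. The code below is an
illustrative sketch only: the function names, the representation of a
rule as a function returning a position in the queue, and the four-job
data set are assumptions. Arrival times are assumed to be sorted.

```python
def simulate(arrivals, services, select):
    # select(queue) returns the position in `queue` of the job to serve next.
    n = len(arrivals)
    waits = [0.0] * n
    order = []            # order in which jobs receive service
    queue = []            # indices of waiting jobs, in order of arrival
    clock = 0.0
    next_arrival = 0
    while len(order) < n:
        # Admit every job that has arrived by the current time.
        while next_arrival < n and arrivals[next_arrival] <= clock:
            queue.append(next_arrival)
            next_arrival += 1
        if not queue:
            clock = arrivals[next_arrival]   # machine idle until next arrival
            continue
        job = queue.pop(select(queue))
        waits[job] = clock - arrivals[job]
        clock += services[job]
        order.append(job)
    return order, waits

def first_come_first_served(queue):
    return 0

def shortest_processing_time(services):
    # Builds a rule that picks the waiting job with the shortest service time.
    return lambda queue: min(range(len(queue)), key=lambda k: services[queue[k]])

arrivals = [0, 1, 2, 3]
services = [5, 4, 1, 2]
fifo_order, fifo_waits = simulate(arrivals, services, first_come_first_served)
spt_order, spt_waits = simulate(arrivals, services,
                                shortest_processing_time(services))
print(fifo_order, fifo_waits)
print(spt_order, spt_waits)
```

With these data the short jobs overtake the long one under the second
rule, so both the order of departures and the waiting time series
change even though the inputs are identical.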

A simulation model must therefore be examined in two ways.
Its data must be examined, both with respect to the particular
representations chosen and the way the model selects samples in its
simulation process; and its structure must be examined to see that
mechanisms have been chosen that produce correct system response. Both
data and structure are important; both pose statistical problems in
analysis and evaluation. Section III treats in detail the problems
outlined in the above example.
III. VERIFICATION


DATA  VERIFICATION 

Inputs in most simulation experiments consist of jobs of some
sort, each characterized by a sequence of random variables. In the
simple queueing model each job is characterized by an interarrival
time and a service time. Each job affects the system to an extent
determined in part by the values of its corresponding 2-tuple. In
general, simulation experiments measure the response of a system to
different sequences of input n-tuples.

In most system simulation models the elements of job n-tuples
are independent random variables and sequences of n-tuples are
independent multivariate random variables. The n-tuple elements are
transformations of pseudorandom numbers drawn from a uniform
distribution on the unit interval. To each n-tuple characterizing a job,
there corresponds an n-tuple of uniformly distributed random variables.
If the model design is proper, the elements of this latter n-tuple
should be independent and uniformly distributed, and so should be
the sequences of these n-tuples.

Absence of independence in generated samples implies that the
assumptions of the model do not hold. Verifying independence
assumptions is the first statistical problem arising in system simulation
experiments. Since the tests of independence in no way relate to
proposed system structure, one may check the pseudorandom number
generator quite separately from other considerations.

The most important hypothesis to test is that the pseudorandom
number generator creates sequences of independent random variables.
Suppose we collect $m$ pseudorandom numbers. If we divide the unit
interval into $k$ class intervals and let $x_i$ be the number of
observations in interval $i$, then for sufficiently large $m$ we may
regard the statistic

$$\chi^2 = \frac{k}{m} \sum_{i=1}^{k} \left( x_i - \frac{m}{k} \right)^2$$

as being $\chi^2$ distributed with $(k - 1)$ degrees of freedom.
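
A minimal sketch of this computation is given below. The function name
and the two small data sets are illustrative assumptions; the resulting
statistic would be compared with a tabulated critical value of the
$\chi^2$ distribution with $(k - 1)$ degrees of freedom, which the sketch
does not compute.

```python
def chi_square_uniform(samples, k):
    # Count observations in k equal class intervals of [0, 1) and return
    # (k / m) * sum_i (x_i - m / k)^2, where x_i is the count in interval i.
    m = len(samples)
    counts = [0] * k
    for u in samples:
        counts[min(int(u * k), k - 1)] += 1
    expected = m / k
    return sum((x - expected) ** 2 for x in counts) / expected

# Perfectly even spread over 10 intervals: the statistic is zero.
even = [(i + 0.5) / 100 for i in range(100)]
stat_even = chi_square_uniform(even, 10)

# All observations in one interval: the statistic is very large.
clumped = [0.01] * 100
stat_clumped = chi_square_uniform(clumped, 10)

print(stat_even, stat_clumped)
```

For 10 intervals (9 degrees of freedom) the tabulated upper 5-percent
point is about 16.9, so the second data set would be rejected decisively.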

Mann and Wald [17], who have studied the problem of choosing $k$
according to some "best" criterion, suggest

$$k = 4 \left[ \frac{2 (m - 1)^2}{c^2} \right]^{1/5},$$

where $c$ is determined by

$$(2\pi)^{-1/2} \int_c^{\infty} e^{-x^2/2} \, dx = \alpha .$$

Cochran [1] describes the sense in which this choice of $k$ is best.
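
The Mann and Wald rule is easy to evaluate numerically. The sketch
below assumes that $c$ is the upper $\alpha$ point of the standard normal
distribution, as in the integral condition above, and obtains it from
the Python standard library; the function name and the values of $m$ and
$\alpha$ are illustrative.

```python
from statistics import NormalDist

def mann_wald_intervals(m, alpha):
    # c is the point exceeded with probability alpha by a standard normal
    # variable; k = 4 * [2 (m - 1)^2 / c^2]^(1/5).
    c = NormalDist().inv_cdf(1.0 - alpha)
    return 4.0 * (2.0 * (m - 1) ** 2 / c ** 2) ** 0.2

k = mann_wald_intervals(200, 0.05)
print(k)
```

For 200 observations and $\alpha = 0.05$ the rule gives a value of $k$ a
little above 31. In practice the result would be rounded to an integer.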

For our purposes, the Mann and Wald criterion seems reasonable. If
$\chi^2$ exceeds $\chi^2_\alpha$, $\alpha$ being the confidence level, we
reject the hypothesis. This test or an equivalent one has been performed
on most pseudorandom number generators and, therefore, our mentioning it
is principally for completeness.

The $\chi^2$ test also applies in testing the independence of n-tuples,
but instead of working with the unit interval we divide the n-dimensional
unit cube into $k$ n-cubes of equal volume and define $x_i$ as the number
of n-tuples in the $i$th n-cube. MacLaren and Marsaglia [16] apply this
test to the output of several pseudorandom number generators for pairs
and triples. Their results show a number of standard generators to
be suspect.
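
For pairs, the same statistic can be computed over cells of the unit
square. The sketch below is an illustration under stated assumptions: it
forms non-overlapping successive pairs, uses the same number of cells
along each axis, and is not the specific procedure of [16]. The function
name and the two test sequences are hypothetical.

```python
def chi_square_pairs(samples, cells_per_axis):
    # Form non-overlapping pairs (u_1, u_2), (u_3, u_4), ... and count them
    # in cells_per_axis ** 2 equal cells of the unit square.
    pairs = list(zip(samples[0::2], samples[1::2]))
    k = cells_per_axis ** 2
    counts = [0] * k
    for x, y in pairs:
        ix = min(int(x * cells_per_axis), cells_per_axis - 1)
        iy = min(int(y * cells_per_axis), cells_per_axis - 1)
        counts[ix * cells_per_axis + iy] += 1
    expected = len(pairs) / k
    return sum((c - expected) ** 2 for c in counts) / expected

# Each number is followed by a copy of itself: pairs fall in only 2 of 4 cells.
dependent = [0.1, 0.1, 0.6, 0.6] * 50
stat_dependent = chi_square_pairs(dependent, 2)

# Pairs spread evenly over all 4 cells.
balanced = [0.1, 0.1, 0.1, 0.6, 0.6, 0.1, 0.6, 0.6] * 25
stat_balanced = chi_square_pairs(balanced, 2)

print(stat_dependent, stat_balanced)
```

Note that the first sequence places exactly half of its numbers in each
half of the unit interval, so a test on single numbers would not flag it;
only the test on pairs reveals the dependence.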

The $\chi^2$ test concerns questions of randomness and makes no use of
the way in which a particular method generates random numbers. Coveyou
and MacPherson [5], who offer a unified theory of the statistical
behavior of n-tuples of pseudorandom generators, conclude that currently
there is no better method of generating n-tuples than the simple
multiplicative congruence method, $r_{i+1} = r_i U \pmod{2^p}$, with a
carefully chosen multiplier, $U$. They describe how to choose the
multiplier, and discuss the effects of computer word length on generated
sequences.
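
A multiplicative congruential generator with a power-of-two modulus can
be sketched in a few lines. The exponent name `p`, the function name,
and the toy values below are assumptions for illustration only; they are
not multipliers recommended in [5], and a modulus of 16 is far too small
for real use.

```python
from itertools import islice

def multiplicative_congruential(seed, multiplier, p):
    # Yields r_{i+1} = r_i * multiplier (mod 2**p), scaled to the unit interval.
    modulus = 1 << p
    r = seed % modulus
    while True:
        r = (r * multiplier) % modulus
        yield r / modulus

# Toy example with modulus 2**4 = 16: the sequence repeats after 4 numbers.
toy = list(islice(multiplicative_congruential(seed=1, multiplier=5, p=4), 5))
print(toy)
```

The short cycle in the toy example shows why both the multiplier and the
word length matter: the period of such a generator is bounded by the
modulus, and a poor multiplier shortens it further.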

A departure from the independence assumption can significantly
affect experimental results. The following example illustrates this
point. Let $x$ and $y$ be pseudorandom numbers that are suitably
transformed; $g(x)$ is used as an interarrival time and $s(y)$ as a
service time. Figure 5 shows the square over which the pair $x$, $y$ are
uniformly distributed.


Fig. 5 (the unit square in the $x$, $y$ plane, divided into four
numbered squares)


Let $p_i$ be the probability of $x$, $y$ being in the $i$th square. If
pairs are independent, then

$$p_i = 1/4, \qquad i = 1, 2, 3, 4.$$

Suppose, however, that the $p_i$ for the square containing small $x$ and
large $y$ is greater than the other three. If interarrival and service
times are increasing functions of $x$ and $y$, respectively, then we
would expect short interarrival times and long service times to occur
together more often than theory suggests. This would cause an upward
bias in the waiting times and queue lengths observed in the simulation
model.
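
The size of such a bias can be explored with a small experiment. The
sketch below is an extreme, purely illustrative case: the transformations
are assumed to be exponential inverse transforms, the rates 1.0 and 1.25
are arbitrary, and the dependent run sets $y = 1 - x$ so that a short
interarrival time is always paired with a long service time. Each job's
pair consists of its own service time and the interarrival time to the
next job.

```python
import math
import random

def mean_wait(interarrivals, services):
    # Waiting times by recursion: the next job waits
    # max(0, wait + service - interarrival); interarrivals[i] separates
    # job i from job i + 1.
    wait = 0.0
    total = 0.0
    for a_next, s_i in zip(interarrivals, services):
        total += wait
        wait = max(0.0, wait + s_i - a_next)
    return total / len(services)

def g(x, arrival_rate=1.0):
    return -math.log(1.0 - x) / arrival_rate     # increasing in x

def s(y, service_rate=1.25):
    return -math.log(1.0 - y) / service_rate     # increasing in y

def open_unit(rng):
    # Uniform draw strictly inside (0, 1), so both logarithms are defined.
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

rng = random.Random(2024)                        # illustrative seed
n_jobs = 100000
xs = [open_unit(rng) for _ in range(n_jobs)]
ys = [open_unit(rng) for _ in range(n_jobs)]
interarrivals = [g(x) for x in xs]

independent = mean_wait(interarrivals, [s(y) for y in ys])
dependent = mean_wait(interarrivals, [s(1.0 - x) for x in xs])
print(independent, dependent)
```

Both runs use the same interarrival times and the same marginal service
time distribution; only the pairing differs. The dependent run should
show a noticeably larger mean waiting time.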


In more complex models, the absence of independence among n-tuples
is more difficult to assess. Verifying that a data source satisfies
the independence assumption will always be of value, however, if an
incorrect interpretation of results is to be avoided. References [5]
and [16] offer helpful information to an experimenter in choosing a
pseudorandom generator.

In some simulation experiments, correlated sampling is necessary.
Suppose we are simulating the demand for aircraft tires; then tire
wear-out is clearly related to the number of aircraft landings.
Simulations of economic behavior often contain autocorrelated input
processes, e.g., autonomous investment. References [7] and [21] describe
methods for generating correlated samples and [19] describes procedures
for sampling from two kinds of autocorrelated processes.

Tocher [24] has pointed out that correlated sampling is often
difficult to perform because of the onerous and often impossible task of
collecting sufficient information to describe desired distributions.
Verification and validation should clearly be applied to correlated
sampling. The peculiar circumstances surrounding different kinds of
correlated sampling make it difficult to suggest a generally applicable
method. Since all sampling ultimately depends on sequences of
independent uniformly distributed random numbers, the least that can be
done is to test the hypothesis that successive numbers and sequences of
numbers are independent.

STRUCTURE  VERIFICATION 

Verifying the structure of simulation models means examining
substructure outputs and determining whether they behave acceptably. One
value of this exercise is that it identifies unwanted system behavior.
Very minor simplifying assumptions can generate output processes whose
behavior differs considerably from what is desired. Structure
verification is also valuable for determining whether one may substitute
an analytical or simple simulation substructure for a complex one. This
may be done if a behavioral equivalence can be established between the
simple and complex structures. The advantages of substitution accrue
from the better understanding of the analytic or simple simulation
structure and from savings in computation time during simulation.
To make behavioral comparisons, we require a probability model.
The model must be sufficiently general to include the variety of
phenomena encountered in simulation models, yet it must be restrictive
enough to permit reasonably straightforward hypothesis testing. System
simulations usually are concerned with series of interrelated events
and an appropriate probability model must explicitly recognize
interrelationships between past, present and future events. Since
these associations are time-dependent, we refer to them as intertemporal
dependence.

In Ref. [10], the writers suggest the class of covariance
stationary stochastic processes as a convenient model for studying
simulation-generated time series. The reasons for this choice are the
valuable conceptual insights that these processes afford as well as the
ease with which certain of their sample statistics (principally the
spectrum) can be used in hypothesis testing. We first formally define a
covariance stationary process and then discuss the meaning of some of its
population parameters.

Let $X_t$ be a random variable generated by a simulation model and
recorded at time $t$. If $\{X_t;\ t = 0, \pm 1, \pm 2, \ldots, \pm\infty\}$
is a stochastic process such that $E(X_t X_{t+\tau})$ is finite and
independent of $t$ for all $\tau$, then $\{X_t\}$ is covariance
stationary. If the random variables $X_t$ and $X_{t+\tau}$ are not
independent for some $\tau \neq 0$, then $\{X_t\}$ is autocorrelated
or linearly dependent. Output processes generally satisfy the
covariance stationary assumptions and exhibit intertemporal dependence.
The theory of covariance stationary processes provides a convenient
framework within which to study the nature and extent of autocorrelation,
the principal form of intertemporal dependence.

The autocovariance function

$$R_\tau = E(X_t X_{t+\tau}) - [E(X_t)]^2$$

summarizes all information concerning the autocorrelation present in
$\{X_t\}$. The spectrum

$$g(\lambda) = \pi^{-1} \sum_{\tau=-\infty}^{\infty} R_\tau e^{-i\lambda\tau}, \qquad 0 \le \lambda \le \pi,$$

provides the same information, and in the writers' opinion is the
preferred function to examine both for conceptual and statistical reasons
[10].
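
Sample versions of these two functions can be sketched as follows. The
estimator below is a deliberately simple assumption: it divides by the
series length, truncates the sum at a chosen maximum lag, and applies no
smoothing weights, so it is an illustration rather than the estimation
procedure of [10]. Function names are hypothetical.

```python
import math

def sample_autocovariance(x, max_lag):
    # R_tau estimated as (1 / n) * sum_t (x_t - mean)(x_{t+tau} - mean).
    n = len(x)
    mean = sum(x) / n
    return [
        sum((x[t] - mean) * (x[t + tau] - mean) for t in range(n - tau)) / n
        for tau in range(max_lag + 1)
    ]

def sample_spectrum(R, lam):
    # Truncated form of g(lambda) = (1 / pi) * sum_tau R_tau exp(-i lambda tau),
    # using R_{-tau} = R_tau so that only cosine terms remain.
    tail = sum(R[tau] * math.cos(lam * tau) for tau in range(1, len(R)))
    return (R[0] + 2.0 * tail) / math.pi

# A series that alternates in sign at every step: rapid fluctuation.
x = [1.0, -1.0] * 50
R = sample_autocovariance(x, 2)
low = sample_spectrum(R, 0.0)
high = sample_spectrum(R, math.pi)
print(R, low, high)
```

For the alternating series the estimate at the highest frequency exceeds
the estimate at zero frequency, as one would expect for a series that
changes direction at every observation.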

The autocovariance function $R_\tau$ measures the covariance between the
random variables $X_t$ and $X_{t+\tau}$. For the class of processes with
which we are concerned this function diminishes, though not necessarily
monotonically, as $|\tau|$ increases. This property accords with reality,
where the influence of the past wears off as time elapses. The spectrum
$g$ permits us to study mean-square variation in a series of interrelated
events in terms of a continuum of frequency components. Since

$$R_0 = \int_0^{\pi} g(\lambda) \, d\lambda ,$$

we may regard the variance as being made up of infinitesimal contributions
$g(\lambda)\,d\lambda$ in small bands $d\lambda$ around each frequency. The spectrum $g$
may be considered a variance decomposition with each component being
associated with a specific frequency. Low frequencies correspond to
long fluctuations in $\{X_t\}$; high frequencies correspond to rapid
fluctuations. If a peak occurs in a spectrum, the corresponding frequency
influences the appearance of $\{X_t\}$ to a greater extent than the remaining
frequencies. A process with a peak at a non-zero frequency in fact displays
something of a periodic appearance, with its period corresponding
approximately to the reciprocal of the frequency at which the peak appears.
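The variance decomposition can be checked numerically. The sketch below is only an illustration, not the estimation procedure of [10]: it assumes a hypothetical first-order autoregressive series, computes sample autocovariances up to an arbitrarily chosen lag of 20, forms the truncated spectrum estimate, and integrates it over 0 to pi. The function names and all parameter values are assumptions made for this example.

```python
import math
import random


def autocovariances(x, max_lag):
    """Sample autocovariances R_0, ..., R_max_lag (mean removed, divisor T)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def spectrum(r, lam):
    """Truncated estimate g(lam) = (1/pi) * (R_0 + 2 * sum_k R_k cos(lam * k))."""
    return (r[0] + 2.0 * sum(r[k] * math.cos(lam * k)
                             for k in range(1, len(r)))) / math.pi


def integrate_spectrum(r, n_bands=200):
    """Trapezoidal integral of the estimated spectrum over [0, pi]."""
    h = math.pi / n_bands
    g = [spectrum(r, k * h) for k in range(n_bands + 1)]
    return h * (0.5 * g[0] + sum(g[1:-1]) + 0.5 * g[-1])


# Hypothetical autocorrelated series: X_t = 0.7 * X_{t-1} + noise.
rng = random.Random(1)
x, prev = [], 0.0
for _ in range(2000):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

r = autocovariances(x, max_lag=20)
total = integrate_spectrum(r)
print(r[0], total)   # the two agree: the spectrum decomposes the variance
```

For this positively correlated series the estimate is largest near zero frequency, which matches the remark that low frequencies correspond to long fluctuations.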

When the subscript t denotes time and $X_t$ is an observation at time
t, observations are collected at equal intervals on the time axis. Since
t is only an index, it need not necessarily refer to time; any series of
events can generate a time series. For example, in the simple queueing
problem t may denote the t-th job to receive service and $X_t$ may be the
waiting time of this job. Here $\{X_t\}$ is a series of waiting times arranged
in the order in which their corresponding jobs receive service.

Interactions between input and structure may often create unwanted
periodicities in the output. This possibility is not as remote as one
would like to think, for Slutzky [22] long ago showed that the linear
summation of purely random events can appear regularly periodic. Since
peaks in a spectrum correspond to periodic components in $\{X_t\}$ and
since the sharper a peak is, the more regular its periodicity is,
examining the sample spectrum permits an experimenter to determine
whether any periodicities exist and to estimate the extent of their
regularity.

Figure 6 shows the sample spectrum of queue length for a single-server
queueing model with exponentially distributed interarrival times
and constant service time. The peak at 0.05 cycles per hour and its
harmonics suggest the presence of periodicity. This behavior can be
explained as follows. Whenever jobs are queueing, a periodic reduction
in queue length occurs every 20 hours. With a constant service time,
jobs emerge from the service facility at a fixed periodic rate, creating
a periodic appearance. If this efflux is the input to another service
facility, then this input is periodic whenever jobs are queueing in the
first facility.

Two points motivate our concern about periodicities. First, their
presence may be contrary to our intentions. Second, since the output
of one substructure is usually the input to another, the effects of
periodicity may propagate themselves throughout the remaining substructures.
It is a property of substructures that they exhibit the characteristics
of electromechanical systems and can have a natural or a
resonant frequency. If a substructure is excited by a frequency close
to its natural one, its response at that frequency is considerably
exaggerated compared to that of others. The strength of a periodic
component may therefore increase as it propagates through a system,
obscuring the behavior of remaining components.

Conclusions drawn from the output of such a system may then be
misleading. For example, one might suppose that the inputs to certain
model subsystems are random phenomena whereas they actually appear in
a model as regular or strongly periodic impulses. If this is so, rules
appropriate for controlling randomly varying inputs may be judged
inappropriate. The performance of the rules will be judged in an
environment different from that for which they were designed.

Fig. 6 -- Estimated queue-length spectrum with constant service time

As mentioned earlier, economy of detail aids understanding and
saves computation time. The ease with which computer simulation languages
permit one to describe complex behavior carries with it the
danger of too much detail. Since a detailed model has more built-in
assumptions than a simple model, it generally requires a longer
learning period for a prospective user. In addition, simulating these
details can consume vast amounts of computer time. If several models
offer the same response to a given input, the simplest model is
advantageous. It is desirable to test several models to determine
the adequacy of each and then choose the simplest among the acceptable
ones.

Suppose  that  a  complex  model  behaves  as  required  and  we  wish  to 
test  the  equivalence  of  a  simpler  model.  If  at  all  possible,  the 
simpler  model  should  be  compared  with  the  true  environment.  When  this 
cannot  be  done,  the  responses  of  the  simple  and  complex  models  should 
be compared for a given input. The comparison tests the hypothesis
that  certain  population  characteristics,  for  example,  means,  variances 
or  spectra,  are  identical  for  both  models. 

Since  intertemporal  dependence  is  often  an  important  characteristic 
of  models,  and  its  mean-square  variation  is  described  by  the  spectrum, 
one  may  compare  mean-square  intertemporal  dependence  by  testing  the 
equivalence  of  spectra  of  two  models.  Jenkins  [15]  and  Fishman  and 
Kiviat [10] describe an appropriate testing procedure.

While  it  is  true  that  higher-order  effects  may  be  dissimilar  in 
the  two  models,  a  comparison  of  spectra  can  do  much  toward  determining 
whether  further  comparisons  are  useful.  The  test  is  simple.  In 
addition,  when  the  null  hypothesis  of  no  difference  is  rejected,  the 
comparison  of  spectra  permits  one  to  identify  where  in  the  structures 
of  the  two  models  the  departures  occur.  With  this  knowledge,  one  may 
perhaps modify the simple structure to more closely match the complex
one.

Verifying a model's structure protects an experimenter against
creating anomalous responses, allows for a justifiably simple design,
and saves computer time. It is a natural imperative to verify both
data and structure before a model is used in order to minimize complications
that can arise in the course of an experiment. Failure to
verify has created more than one embarrassing situation in interpreting
output.

IV. VALIDATION


DATA  VALIDATION 

Validating  a  model  means  establishing  that  it  resembles  its 
actual  system  reasonably  well.  If  a  model  describes  some  hypothetical 
system,  then  no  validation  can  occur.  Also,  if  no  numerical  data 
exist for an actual system, it is not possible to establish the quantitative
congruence of a model with reality. The ideas of this section
therefore  only  apply  when  numerical  data  exist  for  some  or  all  of  an 
actual  system. 

Sampling from a theoretical rather than an empirical distribution
is generally considered preferable, since it exposes a simulated system
to the universe of possible stimuli rather than merely to those that
have occurred in the past. Often, graphical methods suffice to judge
the validity of theoretical distributions. If, for example, we assume
that data have the exponential distribution, then we would expect the
cumulative empirical distribution to appear linear on semilogarithmic
paper. If the normal distribution is assumed, we would expect the
cumulative empirical distribution to appear linear on normal probability
paper. Graphic examination is easy and revealing. Whenever applicable,
it should be used.

The $\chi^2$ test is often proposed for testing the appropriateness of
a chosen sampling distribution, but Cochran [2], among other writers,
has shown the inadequacy of this test when the sample size of the
empirical data is limited and the theoretical distribution is skewed.
As an alternative, Cochran suggests the variance test, which generally
has greater power than the $\chi^2$ goodness-of-fit test and does away with
the need for class intervals.

As an example, we describe the variance test when the null hypothesis
is that a set of independent observations $\{x_i,\ i = 1, 2, \ldots, N\}$
came from an exponential distribution with parameter $\lambda$. Under this
hypothesis we have

\[
E(x_i) = 1/\lambda, \qquad \mathrm{var}(x_i) = 1/\lambda^2 .
\]

As our estimate of $\lambda$ we use the maximum likelihood estimator

\[
\hat{\lambda} = N \Big/ \sum_{i=1}^{N} x_i .
\]

The test statistic is

\[
\sum_{i=1}^{N} \frac{(x_i - 1/\hat{\lambda})^2}{1/\hat{\lambda}^2},
\]

which is approximately distributed as $\chi^2$ with $(N-1)$ degrees of freedom.
No class intervals are required in this test.
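A minimal sketch of the variance test statistic for the exponential case follows. The function names are hypothetical. The second function is only a rough large-sample standardization, using the fact that a chi-square variable with k degrees of freedom has mean k and variance 2k; it is an assumption of this sketch and is no substitute for chi-square tables.

```python
import math


def variance_test_statistic(x):
    """Return (statistic, degrees_of_freedom) for the exponential variance test."""
    n = len(x)
    lam_hat = n / sum(x)              # maximum likelihood estimate of lambda
    mean_hat = 1.0 / lam_hat          # estimated mean, 1 / lambda_hat
    var_hat = 1.0 / lam_hat ** 2      # variance implied by the exponential law
    stat = sum((xi - mean_hat) ** 2 for xi in x) / var_hat
    return stat, n - 1


def normal_score(stat, dof):
    """Standardize the statistic with the chi-square mean (dof) and variance (2 * dof)."""
    return (stat - dof) / math.sqrt(2.0 * dof)


stat, dof = variance_test_statistic([1.0, 2.0, 3.0])
print(stat, dof)   # 0.5 2
```

A statistic far from its degrees of freedom, in either direction, casts doubt on the exponential assumption.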

The $\chi^2$ and variance tests both assume independent observations,
an assumption that also simplifies Monte Carlo sampling. While its
convenience for testing is apparent, the credibility of this assumption
is seldom tested. If a sample record is "sufficiently long," one may
estimate its spectrum and compare it with the uniform spectrum for an
uncorrelated process.

For short records, spectrum comparisons are not possible. Here
we suggest using nonparametric tests of randomness, which do not require
an investigator to make any assumptions about the underlying distribution
of sample data. In addition, the appropriateness of the tests
does not depend on the sample being large. Walsh [25] lists a number of
nonparametric tests that can be applied to small samples.
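One familiar nonparametric test of randomness is the test of runs above and below the median. It is sketched here as an illustration; the choice of this particular test, the function name, and the use of the normal approximation are assumptions of the example and are not taken from [25]. For very short records, exact tables of the run distribution are preferable to the normal approximation.

```python
import math
import statistics


def runs_test(x):
    """Runs above and below the median; returns (number_of_runs, z_score)."""
    med = statistics.median(x)
    signs = [v > med for v in x if v != med]    # drop ties with the median
    n1 = sum(signs)                             # observations above the median
    n2 = len(signs) - n1                        # observations below the median
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    expected = 1.0 + 2.0 * n1 * n2 / n
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    return runs, (runs - expected) / math.sqrt(variance)


trend_runs, trend_z = runs_test([1, 2, 3, 4, 5, 6, 7, 8])   # steadily rising record
zigzag_runs, zigzag_z = runs_test([1, 8, 2, 7, 3, 6, 4, 5])  # alternating record
print(trend_runs, zigzag_runs)   # 2 8
```

Too few runs (a strongly negative score) suggests positive dependence or trend; too many runs suggests alternation. Neither pattern is consistent with independent observations.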

The term "sufficiently long" has an irritating quality about it
for simulation experimenters. Seldom is enough prior information
available to estimate how long to run an experiment. Nevertheless,
most writers on the statistical analysis of simulation experiments
take the length of the sample record as adequate for the analyses they
propose. In [9] a two-stage technique is described wherein one may
estimate how long an experiment is to be run. The procedure is integrated
into a test comparing means, but this should pose no problem
in determining run lengths alone.

STRUCTURE VALIDATION

Having tested assumptions about the data, there remains the task
of validating the structure. If a model resembles reality fairly well,
we expect that its simulated response to a simulated, but valid, input
should exhibit behavior similar to that observed for the real system.
A spectrum analysis is again instructive. Testing the homogeneity of
spectra, one for the actual system's output and the other for the
simulated system's output, is easily accomplished as described in [10]
and [15].

The spectrum comparison applies to testing the homogeneity of
the autocorrelation structure. Comparing means is also desirable since
we would expect no difference if the simulation model adequately resembles
the true system. Since the output processes are generally autocorrelated,
a comparison of means requires more work and care than in
the case of independent observations.

The  procedures  in  [8]  can  easily  be  modified  to  compare  the 
means  of  the  simulated  and  real  systems.  The  variance  of  the  sample 
mean  is  shown  to  be  proportional  to  the  spectrum  at  zero  frequency 
and,  hence,  testing  means  and  testing  spectra  show  a  number  of  common 
features. 

Validation, while desirable, is not always possible. Each
investigator has the soul-searching responsibility of deciding how
much importance to attach to his results. When no experience is
available for comparison, an investigator is well advised to proceed
in steps, first implementing results based on simple well-understood
models and then using the results of this implementation to design
more sophisticated models that yield stronger results. It is only
through gradual development that a simulation can make any claim to
approximate reality. Large scale models that are not amenable to
validation often lead to perplexing, if not misleading, results. This
occurs partly because the complexity of a system confuses a model-builder
and partly because of the tenuous nature of results based on
cascaded approximations. Despite its difficulty, effort must be
expended on model validation -- first, to give credence to results
within the validated range of model operations, and second, to instill
confidence in extrapolations beyond the range of model experience.

Verifying and validating a model comprise but a small share of
the statistical problems in a simulation experiment. Once an experimenter
accomplishes them, he can begin to exercise his model to get
answers. His purpose is to collect data, reduce them, and make inferences
about them, as efficiently as possible. We classify the statistical
problems he encounters under problem analysis. The way he solves
these problems strongly influences the quality of his results.

V. PROBLEM ANALYSIS


One purpose of system simulation experiments is to compare system
responses to different operating rules. In the simple queueing problem,
for example, we may wish to compare the mean queue lengths caused by
given arrival and service rates when different rules are used to assign
priorities to jobs. Another purpose is to determine functional relationships
between input factors and system response. We may simply
wish to get a "feel" for the way in which input and output relate, or
we may wish to use a determined functional relationship in a further
analysis. For example, we may determine a functional relationship
when all inputs are unrestricted and then use this relationship to
find the maximum response when constraints are placed on the inputs.
In some studies, both purposes enter. For simplicity, we treat them
separately.

Regardless of purpose, there are several statistical questions
common to all problem analyses and to structural verification and
validation as well. One question relates to the choice of sampling
interval: What is the proper interval of simulated time between successive
observations of a process of interest? Another question is:
How can results be obtained efficiently with a given reliability?
This topic is often discussed under the heading of "variance reduction
techniques." Reliability estimation itself poses another statistical
problem in system simulation experiments that must be solved before
one can determine how long to run an experiment.

Other statistical questions are peculiar to particular kinds of
experiments. When comparing experiments, one requires statistical
testing procedures. When relating response to input, one asks where
in the input ranges it is best to measure response so that its functional
form can be most easily identified and its parameters most
reliably estimated.

Measurements made in a simulation experiment can be of two kinds.
One kind measures a system's response to all possible situations.
Here the relevant statistic is a time-integrated average. The
other measures a system's response to a specific set of initial
conditions. Time-integrated averages appear to be the most common
measurement. The simulation literature is principally concerned with
them, and the discussion here retains this emphasis. The reader
should not conclude from this that measurement of response to initial
conditions is unimportant. In particular the lack of literature on
the subject should be taken as a comment on its specialized nature,
not its worth. As the use of simulation increases, there will be more
concern for measurements of this kind and more will be written about
them. As indicated, the discussion from here on will be of experiments
performed with the first kind of measurements in mind. The remainder
of this section uses the terms "time-integrated average" and "sample
mean" interchangeably.

SAMPLING  INTERVAL 

When t denotes time the meaning of a time-integrated average is
clear. When t is a more general ordering index, a time-integrated
average refers to the mean value of a quantitative characteristic
of a series of events indexed on t. The term time-integrated average
remains appropriate since the ordering of events is related to time.

If the index denotes time then the choice of sampling interval
is crucial if we hope to extract useful information about the autocorrelation
structure of a process in an efficient way. For our purposes
a sampling interval should be small enough so that within it a
process changes little, if at all. Process activity, not chronological
time, dictates the choice of sampling interval. For each experiment,
other than replications, it is wise to check the adequacy of the
sampling interval, since too small an interval causes redundancy in
the data and too large an interval loses information. Biasing an interval
downward is more desirable than biasing it upward, since redundant
data are far less harmful than lost information.

When t denotes an event in an ordered series, the role of the
sampling interval is changed. Since we simply collect an observation
every time an event occurs, it would seem that we could avoid choosing
a sampling interval. It may occur, however, that successive events are
so highly correlated that collecting information on each event is highly
redundant. When this is the case a judicious choice of sampling
interval reduces the number of observations without sacrificing any
significant information.

VARIANCE REDUCTION TECHNIQUES

It is naturally of interest to obtain experimental results with
specified reliability at minimum cost. Development of variance reduction
techniques was in fact the principal statistical activity in the
early days of computer simulation (Monte Carlo) experiments. The
importance of this activity continues to grow with the increasing
complexity of experiments and their concomitant consumption of computer
time.

Hammersley and Handscomb [11] discuss several variance reduction
techniques, among which the method of antithetic variates appears
easiest to apply. Page [20] shows its use in a simulated queueing
problem. Briefly, by generating $\xi$, a uniformly distributed random
number, in one replication of an experiment and generating $1-\xi$ in a
second replication, the method induces negative correlation between
the responses obtained in both replications. The variance of the
average response of the two replications is consequently smaller than
it would be if the replications were independent. Antithetic variates
may also be used with more than two replications.
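The effect of antithetic variates can be seen in a deliberately small example. In this sketch the "response" of a replication is a single exponentially distributed service time obtained by inverse transformation of one uniform number. A real simulation response depends on many random numbers, so the size of the reduction shown here should not be expected in general. All names and parameter values are hypothetical.

```python
import math
import random
import statistics


def service_time(u, rate=1.0):
    """Inverse-transform sample of an exponential variate from a uniform u in (0, 1)."""
    return -math.log(u) / rate


def uniform_open(rng):
    """Uniform draw strictly inside (0, 1), so log(u) and log(1 - u) are finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def pair_averages(n_pairs, antithetic, seed=7):
    """Average response of two replications, repeated n_pairs times."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_pairs):
        u1 = uniform_open(rng)
        # Second replication: 1 - u1 (antithetic) or a fresh independent draw.
        u2 = 1.0 - u1 if antithetic else uniform_open(rng)
        out.append(0.5 * (service_time(u1) + service_time(u2)))
    return out


independent = statistics.variance(pair_averages(20000, antithetic=False))
paired = statistics.variance(pair_averages(20000, antithetic=True))
print(independent, paired)   # the antithetic pairs show the smaller variance
```

The reduction depends on the response being a monotone function of the random numbers; when it is not, the induced correlation may be weak or even positive.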

When the comparison of experiments is the purpose of a simulation
exercise, one may improve the efficiency of the data-gathering procedure
in another way. When testing the difference of two means, for
example, one may reduce the variance of the difference by choosing
the sample sizes as functions of the variances of the individual sample
means, the computer times required to collect one observation in each
experiment, and the degree of correlation between the sample means.
Inducing a positive correlation between the sample means reduces the
variance of their difference. This can be done, in some cases, by
using the same set of random numbers for both experiments.
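A small experiment illustrates the gain from using the same random numbers in both experiments. The sketch assumes a single-server queue with exponential interarrival and service times, simulated with the Lindley recursion for successive waiting times, and compares two hypothetical service rates. The model, the rates, the run length, and the seeding scheme are all assumptions of this example.

```python
import math
import random
import statistics


def mean_wait(service_rate, seed, n_jobs=500, arrival_rate=1.0):
    """Mean waiting time in a single-server queue (Lindley recursion)."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_jobs):
        # 1 - random() lies in (0, 1], so the logarithm is always finite.
        interarrival = -math.log(1.0 - rng.random()) / arrival_rate
        service = -math.log(1.0 - rng.random()) / service_rate
        total += wait                                   # wait of the current job
        wait = max(0.0, wait + service - interarrival)  # wait of the next job
    return total / n_jobs


def difference_variance(common, n_reps=200):
    """Sample variance of the difference in mean waits between two service rates."""
    diffs = []
    for rep in range(n_reps):
        seed_a = rep
        seed_b = rep if common else rep + 10_000   # same or different random numbers
        diffs.append(mean_wait(1.25, seed_a) - mean_wait(1.5, seed_b))
    return statistics.variance(diffs)


indep_var = difference_variance(common=False)
crn_var = difference_variance(common=True)
print(indep_var, crn_var)
```

With common seeds both experiments see the same arrival pattern and the same underlying service draws, so their sample means move together and the variance of the difference comes out smaller.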

As mentioned in Sec. IV, the choice of the number of observations
to collect in each experiment is a major influence in minimizing the
computer time needed to meet a specified level of accuracy. The
two-stage procedure given in [9] offers a reasonably straightforward
way of coming close to the most efficient sample sizes. When the sample
sizes are chosen close to the efficient solution, a major saving in
computer time accrues.

ESTIMATING  RELIABILITY 

Since experimental results are random variables, it is important
that their reliability as estimates of population parameters be stated
explicitly. Failure to do so obscures the fact that some results may
be better than others. In addition, omitting reliability measures
makes it impossible to determine how much longer to run an experiment
in order to improve its reliability by some fixed amount.

Variance  reduction  techniques  permit  us  to  reduce  the  computer 
time  necessary  to  obtain  a  result  with  a  given  reliability.  We  must 
also  have  a  way  of  estimating  the  reliability  of  a  result.  This  has 
long  been  a  major  problem  area  in  simulation  experiments. 

If a sampling interval is chosen so that observations are independent,
then the variance of a time-integrated average or sample mean
is simply the population variance divided by T, the number of observations.
In general, since simulation data are autocorrelated, the
above approach requires finding a sampling interval such that successive
observations are reasonably independent. Mechanic and McKay [18]
have investigated this approach.

If, however, one treats a simulated process as a covariance
stationary stochastic process (which it generally is), then for large T the
variance of the sample mean is approximately $\pi g(0)/T$, where the function
$g$ is defined in Sec. III and T is the length of the simulation run. A procedure
for estimating $g(0)$ is given in [8], but unfortunately it cannot
easily be incorporated into the experimental run itself.
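Since $\pi g(0)$ equals the sum of all autocovariances, the variance of the sample mean can be approximated from a truncated, weighted sum of sample autocovariances. The sketch below is not the procedure of [8]; the Bartlett (triangular) weights, the truncation lag of 50, and the hypothetical autoregressive test series are assumptions made only for illustration.

```python
import random


def autocovariances(x, max_lag):
    """Sample autocovariances R_0, ..., R_max_lag (mean removed, divisor T)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def variance_of_mean(x, max_lag):
    """Estimate pi * g(0) / T as (R_0 + 2 * sum_k w_k R_k) / T.

    The weights w_k = 1 - k / (max_lag + 1) are Bartlett weights,
    an assumption of this sketch that keeps the estimate non-negative.
    """
    r = autocovariances(x, max_lag)
    total = r[0] + 2.0 * sum((1.0 - k / (max_lag + 1.0)) * r[k]
                             for k in range(1, max_lag + 1))
    return total / len(x)


# Hypothetical autocorrelated output: X_t = 0.8 * X_{t-1} + noise.
rng = random.Random(3)
x, prev = [], 0.0
for _ in range(5000):
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

naive = autocovariances(x, 0)[0] / len(x)     # treats observations as independent
adjusted = variance_of_mean(x, max_lag=50)    # allows for autocorrelation
print(naive, adjusted)
```

For positively autocorrelated output the naive figure understates the variance of the sample mean severalfold, which is why treating simulation data as independent gives a false impression of reliability.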

Another approach is to average sample means from independent replications
of the same experiment. The variance of this average is, of course,
inversely proportional to the number of replications. Using antithetic
variates can reduce the variance even more by inducing negative correlation
between sample means.

COMPARISON OF EXPERIMENTS

In an experiment, response is generally a large-sample, time-integrated
average that satisfies the conditions for asymptotic normality.
This fact greatly simplifies testing the difference of two means
obtained under different operating rules, since the difference of the
sample means is also asymptotically normal. Let the subscripts 1 and
2 denote experiments 1 and 2, respectively. Then for a given significance
level $\alpha$ and tolerance $\delta$, we have, under the null hypothesis
of no difference in the means,

\[
\mathrm{prob}\left(|\bar{X}_1 - \bar{X}_2| \le \delta\right) = 1 - \alpha .
\]

To test the null hypothesis, we require reasonably accurate estimates
of the variances of the sample means. These can be obtained by procedures
described in [9].
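Given estimated variances of the two sample means, the tolerance and the test decision follow directly from the normal approximation. The sketch below assumes the two experiments are independent, so that the variance of the difference is the sum of the two variances; with correlated experiments a covariance term would have to be subtracted. The function name and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def compare_means(mean1, mean2, var_mean1, var_mean2, alpha=0.05):
    """Two-sided large-sample test of no difference between two means.

    var_mean1 and var_mean2 are estimated variances of the sample means.
    Returns (tolerance, reject), where reject is True when the observed
    difference exceeds the tolerance.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)        # normal critical value
    tolerance = z * math.sqrt(var_mean1 + var_mean2)   # independence assumed
    return tolerance, abs(mean1 - mean2) > tolerance


tol, reject_small = compare_means(10.0, 10.5, 0.04, 0.05)
_, reject_large = compare_means(10.0, 11.0, 0.04, 0.05)
print(tol, reject_small, reject_large)
```

The variances supplied must be variances of the sample means that allow for autocorrelation, not population variances divided by the number of observations.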

The comparison just described is the one most commonly applied in
the analysis of experimental results. Multiple comparisons and ordering
procedures are desirable when more than two sets of operating rules
are being considered. Their appropriate statistical procedures are
found in texts on the analysis of variance. To our knowledge, no
study has yet appeared that makes a substantive contribution toward
adapting these procedures to the peculiar environment of computer
simulation experiments.

RESPONSE  MEASUREMENT 

In comparing experiments, one is concerned with the response of
a system to different qualitative factors, such as operating rules.
Alternatively, one may examine the system's response under given operating
rules to changes in quantitative factors, such as different input
activity levels. We refer to this analysis as response measurement.
Its purpose is to find a functional form relating the variable parameters
of an input process to an observed output, and to estimate the
coefficients of the functional form.

Consider a simulation with one input x and one output y. For
each experiment, x assumes a fixed value that is known exactly, whereas
y assumes a value from a probability distribution whose parameters
are functions of x; y is a random variable. In a queueing problem x
might be the mean arrival rate and y the sample mean number of jobs
in queue. If, for estimation purposes, we use the linear least-squares
method, our functional relationship for the i-th observation is


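\[
Y_i = \alpha + \beta X_i + \varepsilon_i, \qquad X_i = f(x_i), \qquad Y_i = f(y_i).
\]

To derive the best linear unbiased estimates of $\alpha$ and $\beta$ with the linear
least-squares method, we require

\[
E(\varepsilon_i) = 0,
\]

\[
E(\varepsilon_i \varepsilon_j) = 0, \qquad i \ne j,
\]

\[
\mathrm{var}(Y_i) = \sigma^2 \quad \mathrm{for\ all}\ i.
\]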
Some commonly used functional forms are listed below.

For a correctly chosen form, the relationship between f(x) and f(y)
will appear linear. Linear, semilog, and log-log graph paper may be
used to find which relationship is most appropriate. Other forms may
be examined, but for the moment we assume that one of the above forms
will hold. Hoerl [14] describes several techniques for identifying
the functional form that linearizes the relationship between x and y.

It is convenient to distinguish between two kinds of observations:
those collected to determine the appropriate functional form, and
those collected to estimate $\alpha$ and $\beta$ with a given level of accuracy.
The first set is a subset of the second.

To satisfy the above regression model, we require all $y_i$'s to be
independent and have a common variance. Independence can be gained
by using different random number sequences in successive simulation
runs with each set of input activity levels. For a given variance,
the proper length of a simulated run may be estimated by the two-stage
procedure to which we have already alluded.

To  find  a  functional  form  it  is  necessary  to  take  observations 
for  a  number  of  input  activity  levels  within  the  range  of  activity 
levels  being  considered.  As  one  would  expect,  the  number  of  such 
observations  is  inversely  related  to  the  variance  of  the  observations. 
The  more  reliable  the  observations,  the  more  confidence  one  can  place 
in  having  identified  an  appropriate  functional  form  for  a  given  number 
of  observations. 

Once an appropriate functional form is found, one uses the observations
already collected to estimate the coefficients. Additional
observations may be collected and used to improve the reliability of
the estimates. The objective at this step is efficiency -- the
conservation of computer time. If the computer times required to
collect all $y_i$'s with equal variance are the same, taking additional
observations at the ends of the x range minimizes the computer time
necessary to improve the reliability of the estimated coefficients
by a given amount. In general, the computer times required to collect
observations with common variance do differ and, hence, the choice of
where to collect observations is not so simple. Little, if anything,
has been published about this problem. Its solution will undoubtedly
improve the efficient performance of simulation experiments.

It may occur that a priori theory suggests a model of the form

\[
Y_i = \alpha + \sum_{j=1}^{s} \beta_j x_i^j + \varepsilon_i .
\]

For this model, unlike the one above, taking observations exclusively
at the end points of the independent variable range does not minimize
the sample size needed for a given accuracy. The points at which
observations should be taken are given by the zeros of a polynomial
which is the integral of one of the Legendre polynomials [13].
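For the straight-line model, the advantage of taking observations at the ends of the x range can be checked directly, since the variance of the least-squares slope is the common variance divided by the sum of squared deviations of the x values from their mean. The sketch below covers only the straight-line case, not the polynomial model; the function names and the six-run designs are hypothetical.

```python
def least_squares(x, y):
    """Ordinary least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return y_bar - beta * x_bar, beta


def slope_variance(x, sigma2=1.0):
    """Variance of the slope estimate: sigma^2 / sum (x_i - x_bar)^2."""
    x_bar = sum(x) / len(x)
    return sigma2 / sum((xi - x_bar) ** 2 for xi in x)


even = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]   # six runs spread over the range
ends = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # six runs split between the end points
print(slope_variance(even), slope_variance(ends))

alpha, beta = least_squares([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
print(alpha, beta)   # 1.0 2.0
```

The end-point design gives the smaller slope variance for the same number of runs, but it offers no check on whether the chosen functional form is correct, which is why the form is identified first from runs at intermediate levels.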

Response surface exploration, optimum seeking methods, and sequential
experimentation are all topics germane to the analysis of computer
simulation experiments. Cochran and Cox [3] describe the principles
of response surface methodology, and Hill and Hunter [12] list a
number of papers covering different aspects of the topic. Draper and
Smith [6] describe procedures for applying a variety of linear regression
analyses. Wilde [26] describes simple methods for finding
maxima and minima. Cochran and Cox also discuss sequential experimentation.
Although these methods contribute significantly to the
statistical analysis of experiments, they remain to be integrated into
a general procedure that takes due cognizance of the peculiarities
of computer simulation experiments.
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